Cushing
Performance of weakly-supervised electronic health record-based phenotyping methods in rare-outcome settings
Hong, Yunjing, Nelson, Jennifer C., Williamson, Brian D.
Accurately identifying patients with specific medical conditions is a key challenge when using clinical data from electronic health records. Our objective was to comprehensively assess when weakly-supervised prediction methods, which use silver-standard labels (proxy measures of the true outcome) rather than gold-standard true labels, perform well in rare-outcome settings like vaccine safety studies. We compared three methods (PheNorm, MAP, and sureLDA) that combine structured features and features derived from clinical text using natural language processing, through an extensive simulation study with data-generating mechanisms ranging from simple to complex, varying outcome rates, and varying degrees of informative silver labels. We also considered using predicted probabilities to design a chart review validation study. No single method dominated the others across all prediction performance metrics. Probability-guided sampling selected a cohort enriched for patients with more mentions of important concepts in chart notes. SureLDA, the most complex of the three algorithms we considered, often performed well in simulations. Performance depended greatly on selected tuning parameters. Care should be taken when using weakly-supervised prediction methods in rare-outcome settings, particularly if the probabilities will be used in downstream analysis, but these methods can work well when silver labels are strong predictors of true outcomes.
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Alaska (0.04)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Health Care Technology > Medical Record (1.00)
Sequential Audit Sampling with Statistical Guarantees
Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or related procedures when an initial sample does not provide a sufficient basis for a conclusion. Across jurisdictions, current standards and practice manuals acknowledge such extensions, while the statistical design of sequential audit procedures has not been fully explored. This study formulates audit sampling with additional, sequentially collected items as a sequential testing problem for a finite population under sampling without replacement. We define null and alternative hypotheses in terms of a tolerable deviation rate, specify stopping and decision rules, and formulate exact sequential boundary conditions in terms of finite-population error probabilities. For practical implementation, we calibrate those boundaries by Monte Carlo simulation at least-favorable deviation rates. The exact design yields ex ante control of decision error probabilities, and the simulation-based implementation approximates that design while allowing the computation of expected stopping times. The framework is most naturally suited to attribute auditing and deviation-rate auditing, especially tests of controls, and it can be extended to one-sided, two-stage, and truncated designs.
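As a rough illustration of the simulation step described above, the following sketch estimates by Monte Carlo the acceptance probability and expected stopping time of a truncated sequential attribute-sampling plan, drawing without replacement from a finite population. It is not the authors' procedure: the function name `simulate_sequential_audit`, the stage sizes, and the accept/reject boundaries are all hypothetical values chosen for the example. Calibrating boundaries would mean repeating such a simulation over candidate boundaries at the least-favorable deviation rate.

```python
import numpy as np

def simulate_sequential_audit(N, D, stages, accept_at, reject_at,
                              n_sims=5000, seed=0):
    """Monte Carlo estimate of (P(accept), expected sample size).

    N, D       : population size and number of deviating items (assumed known
                 here only because we are simulating a hypothetical population)
    stages     : cumulative sample sizes at which the auditor looks at the data
    accept_at  : accept if cumulative deviations <= this value at that stage
    reject_at  : reject if cumulative deviations >= this value at that stage
    """
    if accept_at[-1] + 1 != reject_at[-1]:
        raise ValueError("final stage must force a decision (truncated design)")
    rng = np.random.default_rng(seed)
    max_n = stages[-1]
    n_accept = 0
    total_items = 0
    for _ in range(n_sims):
        # Sampling without replacement: items 0..D-1 are the deviations.
        order = rng.choice(N, size=max_n, replace=False)
        cum_dev = np.cumsum(order < D)
        for n, a, r in zip(stages, accept_at, reject_at):
            k = cum_dev[n - 1]
            if k <= a or k >= r:
                n_accept += int(k <= a)
                total_items += n
                break
    return n_accept / n_sims, total_items / n_sims

# Hypothetical plan: population of 1000 items, deviation rate exactly at an
# assumed tolerable rate of 5% (D = 50), three looks at 30, 60, and 90 items.
p_accept, expected_n = simulate_sequential_audit(
    N=1000, D=50, stages=(30, 60, 90), accept_at=(0, 1, 2), reject_at=(3, 4, 3)
)
print(f"estimated P(accept) = {p_accept:.3f}, expected sample size = {expected_n:.1f}")
```

Here `p_accept` at the tolerable rate plays the role of an estimated decision error probability, and `expected_n` is the simulated expected stopping time.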
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Asia > Malaysia (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > New York (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > New York (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
Asymptotic Theory and Phase Transitions for Variable Importance in Quantile Regression Forests
Nakamura, Tomoshige, Shiraishi, Hiroshi
Quantile Regression Forests (QRF) are widely used for non-parametric conditional quantile estimation, yet statistical inference for variable importance measures remains challenging due to the non-smoothness of the loss function and the complex bias-variance trade-off. In this paper, we develop an asymptotic theory for variable importance defined as the difference in pinball loss risks. We first establish the asymptotic normality of the QRF estimator by handling the non-differentiable pinball loss via Knight's identity. Second, we uncover a "phase transition" phenomenon governed by the subsampling rate $\beta$ (where $s \asymp n^{\beta}$). We prove that in the bias-dominated regime ($\beta \ge 1/2$), which corresponds to large subsample sizes typically favored in practice to maximize predictive accuracy, standard inference breaks down as the estimator converges to a deterministic bias constant rather than a zero-mean normal distribution. Finally, we derive the explicit analytic form of this asymptotic bias and discuss the theoretical feasibility of restoring valid inference via analytic bias correction. Our results highlight a fundamental trade-off between predictive performance and inferential validity, providing a theoretical foundation for understanding the intrinsic limitations of random forest inference in high-dimensional settings.
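To make "difference in pinball loss risks" concrete, here is a minimal sketch on a toy problem. It does not fit a quantile regression forest: the two quantile predictors below (one that uses the covariate, one that ignores it) are hand-built stand-ins, and the data-generating model, sample size, and variable names are assumptions made for the example.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean pinball (check) loss of quantile prediction q at level tau."""
    u = np.asarray(y) - q
    return np.mean(u * (tau - (u < 0)))

rng = np.random.default_rng(0)
n, tau = 5000, 0.9

# Toy model (assumed for illustration): y depends on a single covariate x.
x = rng.normal(size=n)
y = x + rng.normal(size=n)

# Stand-in predictor that uses x: shift x by the tau-quantile of the residuals.
q_with_x = x + np.quantile(y - x, tau)
# Stand-in predictor that ignores x: the marginal tau-quantile of y.
q_without_x = np.quantile(y, tau)

# Importance of x as the increase in pinball risk when x is dropped.
importance = pinball_loss(y, q_without_x, tau) - pinball_loss(y, q_with_x, tau)
print(f"estimated importance of x at tau={tau}: {importance:.3f}")
```

A positive value indicates that dropping the covariate raises the pinball risk. The inferential question studied in the paper concerns the sampling distribution of such a difference when the predictors are forests, which this sketch does not address.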
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Asia > Pakistan (0.04)
3D Mapping Using a Lightweight and Low-Power Monocular Camera Embedded inside a Gripper of Limbed Climbing Robots
Okawara, Taku, Nishibe, Ryo, Kasano, Mao, Uno, Kentaro, Yoshida, Kazuya
Limbed climbing robots are designed to explore challenging vertical walls, such as the skylights of the Moon and Mars. In such robots, the primary role of a hand-eye camera is to accurately estimate 3D positions of graspable points (i.e., convex terrain surfaces) thanks to its close-up views. While conventional climbing robots often employ RGB-D cameras as hand-eye cameras to facilitate straightforward 3D terrain mapping and graspable point detection, RGB-D cameras are large and consume considerable power. This work presents a 3D terrain mapping system designed for space exploration using limbed climbing robots equipped with a monocular hand-eye camera. Compared to RGB-D cameras, monocular cameras are lighter, more compact, and consume less power. Although monocular SLAM can be used to construct 3D maps, it suffers from scale ambiguity. To address this limitation, we propose a SLAM method that fuses monocular visual constraints with limb forward kinematics. The proposed method jointly estimates time-series gripper poses and the global metric scale of the 3D map based on factor graph optimization. We validate the proposed framework through both physics-based simulations and real-world experiments. The results demonstrate that our framework constructs a metrically scaled 3D terrain map in real time and enables autonomous grasping of convex terrain surfaces using a monocular hand-eye camera, without relying on RGB-D cameras. Our method contributes to scalable and energy-efficient perception for future space missions involving limbed climbing robots. See the video summary here: https://youtu.be/fMBrrVNKJfc
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Kumamoto Prefecture > Kumamoto (0.04)
- Asia > Japan > Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
Proceedings of the 2025 XCSP3 Competition
Audemard, Gilles, Lecoutre, Christophe, Lonca, Emmanuel
... Competition 2025, following those published in 2022 [2], 2023 [3], and 2024 [4]. The website containing all detailed results of this international competition is available at: https://www.cril.univ-artois.fr/XCSP25 The organization of this 2025 competition involved the following tasks: adjusting general details (dates, tracks, ... These instances can be found in this archive. Some (usually minor) differences may exist when compiling the models presented in this document and those that can be found in this archive. Remember that the complete description, Version 3.2, of the format (XCSP ... For the 2025 competition, 33 problems have been selected. They are succinctly presented in Table 1.1. For each problem, the type of the involved (global) constraints is indicated. At this point, do note that making a good selection of problems/instances is a difficult task. When table is followed by (), it means that starred tables are involved. It is always interesting to see how constraint solvers behave when the instances of a problem become harder and harder. This is what we call the scaling behavior of solvers.
- Europe > Austria > Styria > Graz (0.04)
- North America > United States > Kansas (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Information Technology > Security & Privacy (0.46)
- Leisure & Entertainment > Sports (0.45)
Sparse and nonparametric estimation of equations governing dynamical systems with applications to biology
Pillonetto, G., Giaretta, A., Aravkin, A., Bisiacco, M., Elston, T.
Data-driven discovery of model equations is a powerful approach for understanding the behavior of dynamical systems in many scientific fields. In particular, the ability to learn mathematical models from data would benefit systems biology, where the complex nature of these systems often makes a bottom-up approach to modeling infeasible. In recent years, sparse estimation techniques have gained prominence in system identification, primarily using parametric paradigms to efficiently capture system dynamics with minimal model complexity. In particular, the Sindy algorithm has successfully used sparsity to estimate nonlinear systems by extracting from a library of functions only the few key terms needed to capture the dynamics of these systems. However, parametric models often fall short in accurately representing certain nonlinearities inherent in complex systems. To address this limitation, we introduce a novel framework that integrates sparse parametric estimation with nonparametric techniques. It captures nonlinearities that Sindy cannot describe, without requiring a priori information about their functional form, that is, without expanding the library of functions to include the very function one is trying to discover. We illustrate our approach on several examples related to estimation of complex biological phenomena.
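The sparse, library-based step attributed to Sindy above can be sketched as sequentially thresholded least squares. This is only the parametric half of the picture and not the framework proposed in the paper: the nonparametric component is omitted, and the logistic-growth system, the noise-free derivatives, the polynomial library, and the threshold are all assumptions chosen for the example.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold
        xi[~keep] = 0.0
        if not keep.any():
            break
        # Refit using only the library columns that survived thresholding.
        xi[keep] = np.linalg.lstsq(theta[:, keep], dxdt, rcond=None)[0]
    return xi

# Assumed toy system: logistic growth dx/dt = r*x*(1 - x/K) with r = 1, K = 2,
# i.e. dx/dt = 1.0*x - 0.5*x**2. Derivatives are taken as exact (no noise).
x = np.linspace(0.1, 1.9, 50)
dxdt = 1.0 * x - 0.5 * x**2

# Candidate library of functions: [1, x, x^2, x^3].
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

xi = stlsq(theta, dxdt)
print(dict(zip(["1", "x", "x^2", "x^3"], np.round(xi, 6))))
```

If the true dynamics contained a term outside this library (say, a saturating nonlinearity), the sparse fit would have to approximate it with the available polynomials; that gap is what the abstract says the nonparametric part is meant to cover.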
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (0.34)
Token Distillation: Attention-aware Input Embeddings For New Tokens
Dobler, Konstantin, Elliott, Desmond, de Melo, Gerard
New tokens can be added to solve this problem, when coupled with a good initialization for their new embeddings. This excessive tokenization not only leads to reduced performance on downstream tasks (Rust et al., 2021; Ali et al., 2024) but also increases the computational ... Although adding new tokens to a model's vocabulary can reduce over-tokenization, it ... Whenever we wish to add a new token to a pretrained model's vocabulary, this new token may ... The semantics of a word composed of multiple subtokens will largely not be stored in their raw input embeddings at all, but rather constructed by the Transformer's attention/feed-forward layer stack during contextualization (Elhage et al., 2022; Lad et al., 2024; ... We demonstrate the efficacy of our method, dubbed "Token Distillation", in Section 5. We illustrate ... Our experimental setup is detailed in Section 4. In summary, our contributions are as follows. We motivate our proposed method by describing the fundamental limitations of current embedding initialization methods and empirically verify our claims. Most state-of-the-art Large Language Models (LLMs) are trained using a static tokenizer, usually derived by a byte-pair encoding scheme before model training (Sennrich et al., 2016). Furthermore, Lesci et al. (2025) show that in practice, words which are not a single ... A solution to this problem is to modify the existing vocabulary to suit the specific needs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Health & Medicine > Therapeutic Area (1.00)
- Education (0.93)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Motivating Students' Self-study with Goal Reminder and Emotional Support
Cho, Hyung Chan, Cha, Go-Eum, Liu, Yanfu, Jeong, Sooyeon
While the efficacy of social robots in supporting people in learning tasks has been extensively investigated, their potential to assist students in self-study contexts remains underexplored. This study explores how a social robot can act as a peer study companion for college students during self-study tasks by delivering task-oriented goal reminders and positive emotional support. We conducted an exploratory Wizard-of-Oz study to examine how these robotic support behaviors affected students' perceived focus, productivity, and engagement in comparison to a robot that only provided physical presence (control). Our study results suggest that participants in the goal reminder and the emotional support conditions reported greater ease of use, with the goal reminder condition additionally showing a higher willingness to use the robot in future study sessions. Participants' satisfaction with the robot was correlated with their perception of the robot as a social other, and this perception was found to be a predictor of their level of goal achievement in the self-study task. These findings highlight the potential of socially assistive robots to support self-study through both functional and emotional engagement. Peer relationships in educational settings play a crucial role in generating relatedness and support that are influential in fostering academic success [1]-[4]. Peer support has been shown to positively impact students' learning by fostering a sense of connectedness, which enhances productivity, academic performance, and study well-being [1], [3], [5], [6].
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)